The Cornell TIPSTER Phase III Project

Author

  • F. Ruth Gee

Abstract

The overall objective of the Cornell University TIPSTER Project was to improve end-user efficiency in information retrieval (IR) systems by reducing the amount of text that the user must process [1]. The project focuses on high-precision IR, near-duplicate detection, and context-dependent summarization. The two main foundations of the research are the latest version of the Smart system for information retrieval and the Empire system for natural language processing. Smart is an implementation of the vector-space model of IR. Its original purpose was to provide a framework for conducting IR research, but current developments will make the system easier for nonresearchers to use. Empire is a research-oriented system that uses machine learning methods to quickly perform partial parsing of sentences.

[...] option, the system will attempt to retrieve fewer documents than it would in a normal search, but most of the documents in the returned hit list should be useful. Emphasis on high precision, however, exacts a penalty in terms of recall. That is, some of the relevant documents or passages available in the stored text collection might not be returned to the user, since the system retrieves fewer documents overall.
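
The precision/recall trade-off described in the abstract can be illustrated with a small sketch. It is not taken from Smart or from the paper: the function name `precision_recall`, the document identifiers, and the relevance judgments are all hypothetical, chosen only to show how returning a shorter hit list can raise precision while lowering recall.

```python
def precision_recall(ranked_ids, relevant_ids, cutoff):
    """Return (precision, recall) for the top `cutoff` documents of a ranking.

    Hypothetical helper for illustration; not part of Smart or Empire.
    """
    retrieved = ranked_ids[:cutoff]
    hits = sum(1 for doc_id in retrieved if doc_id in relevant_ids)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant_ids) if relevant_ids else 0.0
    return precision, recall


# Made-up ranking and relevance judgments (assumptions, not real data).
ranked = ["d1", "d2", "d3", "d4", "d5", "d6"]
relevant = {"d1", "d2", "d5", "d6"}

# Short hit list: every returned document is useful, but half are missed.
print(precision_recall(ranked, relevant, cutoff=2))  # (1.0, 0.5)

# Long hit list: all relevant documents are found, with more noise.
print(precision_recall(ranked, relevant, cutoff=6))
```

With these made-up judgments, the two-document list is fully relevant but finds only half of the relevant documents, while the six-document list finds all of them at a precision of 4/6.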

Similar resources

The Text REtrieval Conferences (TRECs)

Phase III of the TIPSTER project included three workshops for evaluating document detection (information retrieval) projects: the fifth, sixth and seventh Text REtrieval Conferences (TRECs). This work was co-sponsored by the National Institute of Standards and Technology (NIST), and included evaluation not only of the TIPSTER contractors, but also of many information retrieval groups outside of...

TIPSTER Phase III Goals

The primary goal of TIPSTER Phase III is to promote advancements in text processing technologies. To accomplish this goal, the TIPSTER Program will continue to encourage the cooperation of researchers and developers in government, industry and academia to achieve a balanced overall program. The Phase III framework is modeled on that of Phase II and will consist of four basic components: (1) Adv...

The SRI TIPSTER III Project

One step towards ease-of-use by nonexperts was the development reported in Phase II [1] of SRI's FastSpec language which enabled greater facility in generating and modifying the syntactic and semantic patterns necessary for identifying pertinent data. This was a motivating factor for the establishment of the Common Pattern Specification Language (CPSL) Working Group devoted to formulating a CPS...

Automatic Text Summarization in TIPSTER

Automatic Text Summarization was added as a major research thrust of the TIPSTER program during TIPSTER Phase III, 1996-1998. It is a natural extension of the previously supported research efforts in Information Extraction (IE) and Information Retrieval (IR). There is considerable interest in automatically producing summaries due, in large part, to the growth of the Internet and the World Wide ...

The TIPSTER Text Program Overview

These TIPSTER Phase III Proceedings bring to a close a program that had significant impact on information technology. Since 1991, the TIPSTER Text program has fostered the advancement of state-of-the-art technologies for text handling through the efforts of researchers and developers in the U.S. Government, industry and academia. The resulting capabilities are being deployed throughout the intel...

Publication date: 1998